1 Exercise 01

Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case:

  • Australian Population (global_economy)
  • Bricks (aus_production)
  • NSW Lambs (aus_livestock)
  • Household wealth (hh_budget).
  • Australian takeaway food turnover (aus_retail).

1.1 Australian Population (global_economy)

1.2 Bricks (aus_production)

1.3 NSW Lambs (aus_livestock)

1.4 Household wealth (hh_budget).

1.5 Australian takeaway food turnover (aus_retail).

2 Exercise 02

Use the Facebook stock price (data set gafa_stock) to do the following:

  • Produce a time plot of the series.
  • Produce forecasts using the drift method and plot them.
  • Show that the forecasts are identical to extending the line drawn between the first and last observations.
  • Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?

2.1 Time plot

We have a problem here: some dates are missing (that’s why the x-axis label reads Date [!], which is tsibble’s marker for an irregular index).

I’m going to fix this with linear interpolation.

fb_interpolation <- fb_raw %>% 
    update_tsibble(index = Date, regular = TRUE) %>%  # declare a regular daily index
    fill_gaps() %>%                                   # add the missing dates as NA rows
    select(Close) %>% 
    mutate(Close = approx(Date, Close, Date)$y)       # linear interpolation over the NAs

autoplot(fb_interpolation, Close)

Another solution could have been to take the prices at the start of each week, for each year … ?
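That alternative could look something like this (a sketch, assuming the `fb_raw` tsibble from above; `fb_weekly` is my own name):

```r
fb_weekly <- fb_raw %>%
    index_by(Week = yearweek(Date)) %>%   # group the daily index into weeks
    summarise(Close = first(Close))       # keep the price at the start of each week

autoplot(fb_weekly, Close)
```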

2.2 Forecast with drift method

fb_interpolation_fit <- fb_interpolation %>% 
    model(RW(Close ~ drift()))

fb_interpolation_fit %>% 
    forecast(h = 100) %>% 
    autoplot(fb_interpolation) +
    labs(title = "Symbol = FB",
         subtitle = "Forecast with DRIFT method") +
    scale_x_date(date_breaks = "1 year",
                 date_labels = "%Y.")

2.3 Details of drift method

Show that the forecasts are identical to extending the line drawn between the first and last observations.
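For reference, the drift point forecast is

```latex
\hat{y}_{T+h|T} = y_T + h \left( \frac{y_T - y_1}{T - 1} \right)
```

i.e. the last observation plus h times the average historical increment; the slope term is exactly that of the line through the first and last observations, which is what the function below reproduces.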

manual_drift_forecast <- function(h) {
    history_length <- length(fb_interpolation$Close)
    value_history_start <- fb_interpolation$Close[1]
    value_history_end <- fb_interpolation$Close[history_length]
    h_term <- (value_history_end - value_history_start) / (history_length - 1)
    return(value_history_end + h * h_term)
}

final_check <- fb_interpolation_fit %>% 
    forecast(h = 100) %>% 
    as_tibble() %>% 
    select(Date, .mean) %>% 
    purrr::set_names(c("Date", "DRIFT_MODEL")) %>% 
    mutate(h = seq_along(Date),
           DRIFT_MANUAL = sapply(h, manual_drift_forecast),
           DIFFERENCE = DRIFT_MANUAL - DRIFT_MODEL) 

final_check %>% 
    filter(DIFFERENCE != 0)
## # A tibble: 17 × 5
##    Date       DRIFT_MODEL     h DRIFT_MANUAL DIFFERENCE
##    <date>           <dbl> <int>        <dbl>      <dbl>
##  1 2019-01-13        132.    13         132.  -2.84e-14
##  2 2019-01-29        132.    29         132.  -2.84e-14
##  3 2019-02-03        133.    34         133.  -2.84e-14
##  4 2019-02-08        133.    39         133.  -2.84e-14
##  5 2019-02-19        133.    50         133.  -2.84e-14
##  6 2019-02-24        133.    55         133.  -2.84e-14
##  7 2019-03-01        134.    60         134.  -2.84e-14
##  8 2019-03-06        134.    65         134.  -2.84e-14
##  9 2019-03-12        134.    71         134.  -2.84e-14
## 10 2019-03-17        134.    76         134.  -2.84e-14
## 11 2019-03-22        134.    81         134.  -2.84e-14
## 12 2019-03-27        135.    86         135.  -2.84e-14
## 13 2019-03-28        135.    87         135.  -2.84e-14
## 14 2019-04-01        135.    91         135.  -2.84e-14
## 15 2019-04-02        135.    92         135.  -2.84e-14
## 16 2019-04-06        135.    96         135.  -2.84e-14
## 17 2019-04-07        135.    97         135.  -2.84e-14

These are just floating-point rounding differences (on the order of 1e-14), so the two sets of forecasts are identical.

2.4 Other methods

Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?

I still think the drift method is best, since we don’t have any seasonality in this time series.
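For reference, the comparison can be reproduced with something like this (a sketch; the model names are my own, and SNAIVE is omitted because daily stock prices have no usable seasonal period):

```r
fb_interpolation %>%
    model(
        F_MEAN  = MEAN(Close),          # mean of the whole history
        F_NAIVE = NAIVE(Close),         # last observed value
        F_DRIFT = RW(Close ~ drift())   # last value plus average change
    ) %>%
    forecast(h = 100) %>%
    autoplot(fb_interpolation, level = NULL)
```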

3 Exercise 03

Apply a seasonal naïve method to the quarterly Australian beer production data from 1992. Check if the residuals look like white noise, and plot the forecasts. The following code will help.

# Extract data of interest
recent_production <- aus_production %>%
  filter(year(Quarter) >= 1992)

# Define and estimate a model
fit <- recent_production %>% model(SNAIVE(Beer))

# Look at the residuals
fit %>% gg_tsresiduals()

A couple of points should be made: the residual distribution does not look normal.

We can confirm this with a Shapiro–Wilk test (note: the test below is run on the fitted values; strictly, it should be run on the residuals, `.resid`):

augment(fit) %>% 
    pull(.fitted) %>% 
    shapiro.test(.)
## 
##  Shapiro-Wilk normality test
## 
## data:  .
## W = 0.86827, p-value = 2.667e-06

The point forecasts will probably still be quite good, but prediction intervals computed assuming normally distributed residuals may be inaccurate.
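The forecasts themselves (“plot the forecasts”) can be produced along these lines (h = 8, i.e. two years, is my own choice):

```r
fit %>%
    forecast(h = 8) %>%        # two years of quarterly forecasts
    autoplot(recent_production)
```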

4 Exercise 04

Repeat the previous exercise using the Australian Exports series from global_economy and the Bricks series from aus_production. Use whichever of NAIVE() or SNAIVE() is more appropriate in each case.

4.1 Australian Exports series (global_economy)

We cannot use SNAIVE() here since the data are on a year-by-year basis (there are no seasons to repeat). So the only reasonable method is NAIVE(), though I would prefer RW(y ~ drift()) here.

Now let’s see Paul Allen’s forecast method.

4.2 Bricks (aus_production)

IMO, NAIVE() is better here, since SNAIVE() gives non-constant variance, a right-skewed distribution of residuals, etc.

5 Exercise 05

Produce forecasts for the 7 Victorian series in aus_livestock using SNAIVE(). Plot the resulting forecasts including the historical data. Is this a reasonable benchmark for these series?
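A minimal sketch of those forecasts (the two-year horizon and the `vic_livestock` name are my own choices):

```r
vic_livestock <- aus_livestock %>%
    filter(State == "Victoria")        # the 7 Victorian animal series

vic_livestock %>%
    model(SNAIVE(Count)) %>%
    forecast(h = "2 years") %>%
    autoplot(vic_livestock)
```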

Yes, these are reasonable benchmarks.

An argument could be made for:

6 Exercise 06

Are the following statements true or false? Explain your answer.

7 Exercise 07

For your retail time series (from Exercise 8 in Section 2.10):

Create a training dataset consisting of observations before 2011 using:

set.seed(12345678)

myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries_train <- myseries %>%
  filter(year(Month) < 2011)

Check that your data have been split appropriately by producing the following plot.

autoplot(myseries, Turnover) +
  autolayer(myseries_train, Turnover, colour = "red")

Fit a seasonal naïve model using SNAIVE() applied to your training data (myseries_train).

fit <- myseries_train %>%
  model(SNAIVE(Turnover))

Check the residuals. Do they appear to be uncorrelated and normally distributed?

fit %>% gg_tsresiduals()

No, the residuals do not appear to be uncorrelated and normally distributed.

Produce forecasts for the test data

fc <- fit %>%
  forecast(new_data = anti_join(myseries, myseries_train))

fc %>% autoplot(myseries)

Compare the accuracy of your forecasts against the actual values.

accuracy(fc, myseries)
## # A tibble: 1 × 12
##   .model     State Indus…¹ .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>      <chr> <chr>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(Tu… Nort… Clothi… Test  0.836  1.55  1.24  5.94  9.06  1.36  1.28 0.601
## # … with abbreviated variable name ¹​Industry

The forecasts on the test data seem reasonably accurate (MAPE ≈ 9%), although a MASE above 1 means the test errors are larger than the average in-sample seasonal naïve error.

How sensitive are the accuracy measures to the amount of training data used?

As we can see from the graph, there is an optimal training-set size (72 observations) that brings the errors to a minimum, so the accuracy measures are quite sensitive to the amount of training data used.
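The experiment behind that graph can be sketched as follows (the grid of training sizes and all object names here are my own, not from the original code):

```r
sizes <- seq(24, nrow(myseries_train), by = 12)

rmse_by_size <- purrr::map_dbl(sizes, function(n) {
    train <- myseries_train %>% slice_tail(n = n)   # last n training observations
    fc <- train %>%
        model(SNAIVE(Turnover)) %>%
        forecast(new_data = anti_join(myseries, myseries_train))
    accuracy(fc, myseries)$RMSE
})
```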

8 Exercise 08

Consider the number of pigs slaughtered in New South Wales (data set aus_livestock).

  • Produce some plots of the data in order to become familiar with it.
  • Create a training set of 486 observations, withholding a test set of 72 observations (6 years).
  • Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
  • Check the residuals of your preferred method. Do they resemble white noise?

8.1 EDA

Let’s check the actual time series:
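(`aus_pigs` is not defined in this excerpt; it is presumably constructed along these lines:)

```r
aus_pigs <- aus_livestock %>%
    filter(Animal == "Pigs",
           State == "New South Wales")

autoplot(aus_pigs, Count)
```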

Seasonality is present, and the trend resembles a random walk.

Combined with this:

aus_pigs %>% 
    features(Count, feat_stl) %>% 
    transpose()
## [[1]]
## [[1]]$trend_strength
## [1] 0.950971
## 
## [[1]]$seasonal_strength_year
## [1] 0.6424557
## 
## [[1]]$seasonal_peak_year
## [1] 6
## 
## [[1]]$seasonal_trough_year
## [1] 7
## 
## [[1]]$spikiness
## [1] 9148187789
## 
## [[1]]$linearity
## [1] -49899.04
## 
## [[1]]$curvature
## [1] -445937.5
## 
## [[1]]$stl_e_acf1
## [1] -0.1929197
## 
## [[1]]$stl_e_acf10
## [1] 0.4107981

… we can say that:

  • trend_strength: the trend component is strong (it accounts for most of the variation relative to the remainder component).
  • seasonal_strength_year: the seasonal component is also strong, but not as strong as the trend.
  • seasonal_peak_year: June contains the largest seasonal component.
  • seasonal_trough_year: July contains the smallest seasonal component.
  • linearity & curvature: the trend is negative (downward sloping).

Create a training set of 486 observations, withholding a test set of 72 observations (6 years).

# time series split is from "preamble.R" -> custom function.

tt_split <- time_series_split(aus_pigs, 486)

tt_split
## $train
## # A tsibble: 486 x 2 [1M]
##       Month  Count
##       <mth>  <dbl>
##  1 1972 Jul  97400
##  2 1972 Aug 114700
##  3 1972 Sep 109900
##  4 1972 Oct 108300
##  5 1972 Nov 122200
##  6 1972 Dec 106900
##  7 1973 Jan  96600
##  8 1973 Feb  96700
##  9 1973 Mar 121200
## 10 1973 Apr  99300
## # … with 476 more rows
## # ℹ Use `print(n = ...)` to see more rows
## 
## $test
## # A tsibble: 72 x 2 [1M]
##       Month Count
##       <mth> <dbl>
##  1 2013 Jan 71300
##  2 2013 Feb 69700
##  3 2013 Mar 79900
##  4 2013 Apr 74300
##  5 2013 May 87200
##  6 2013 Jun 71200
##  7 2013 Jul 86200
##  8 2013 Aug 78000
##  9 2013 Sep 71200
## 10 2013 Oct 78200
## # … with 62 more rows
## # ℹ Use `print(n = ...)` to see more rows
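`time_series_split()` comes from the author’s `preamble.R` and is not shown; a hypothetical reconstruction:

```r
# Hypothetical reconstruction: split a tsibble into the first
# `train_size` rows (train) and everything after them (test).
time_series_split <- function(dataset, train_size) {
    list(
        train = dataset %>% dplyr::slice(1:train_size),
        test  = dataset %>% dplyr::slice(-(1:train_size))
    )
}
```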

Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?

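The `fit` and `fc` objects used below are not shown in this excerpt; a plausible sketch (model names chosen to match the accuracy table):

```r
fit <- tt_split$train %>%
    model(
        F_MEAN   = MEAN(Count),
        F_NAIVE  = NAIVE(Count),
        F_SNAIVE = SNAIVE(Count),
        F_DRIFT  = RW(Count ~ drift())
    )

fc <- fit %>%
    forecast(h = nrow(tt_split$test))   # 72 months ahead
```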

accuracy(fc, tt_split$test) %>% 
    arrange(RMSE)
## # A tibble: 4 × 10
##   .model   .type      ME   RMSE    MAE    MPE  MAPE  MASE RMSSE    ACF1
##   <chr>    <chr>   <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 F_DRIFT  Test   -4685.  8091.  6967.  -7.36  10.1   NaN   NaN  0.0785
## 2 F_NAIVE  Test   -6138.  8941.  7840.  -9.39  11.4   NaN   NaN  0.0545
## 3 F_SNAIVE Test   -9204. 11802. 10312. -13.7   15.0   NaN   NaN -0.0629
## 4 F_MEAN   Test  -39360. 39894. 39360. -55.9   55.9   NaN   NaN  0.0545

The drift method did best. (MASE and RMSSE are NaN because accuracy() was given only the test set here, so there is no training data to compute the scaling factor.)

Check the residuals of your preferred method. Do they resemble white noise?

fit %>% 
    select(F_DRIFT) %>% 
    gg_tsresiduals()

The innovation residuals do resemble white noise, but the ACF and the residual distribution suggest that the computed prediction intervals will most likely not be correct.

9 Exercise 09

Create a training set for household wealth (hh_budget) by withholding the last four years as a test set.

tt_split <- time_series_split(
    dataset = aus_wealth,
    train_size = nrow(aus_wealth) - 4
)

Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.

fit <- tt_split$train %>% 
    model(
        F_MEAN = MEAN(Wealth),
        F_NAIVE = NAIVE(Wealth),
        F_SNAIVE = SNAIVE(Wealth ~ lag(2)),
        F_DRIFT = RW(Wealth ~ drift())
    )
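The `fc` object compared below would then be produced over the four withheld years:

```r
fc <- fit %>%
    forecast(h = 4)   # the four years of the test set
```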

Compute the accuracy of your forecasts. Which method does best?


accuracy(fc, tt_split$test) %>% 
    arrange(RMSE)
## # A tibble: 4 × 11
##   .model   Country   .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE    ACF1
##   <chr>    <chr>     <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 F_DRIFT  Australia Test   29.1  35.5  29.1  7.23  7.23   NaN   NaN  0.210 
## 2 F_NAIVE  Australia Test   34.7  41.5  34.7  8.64  8.64   NaN   NaN  0.216 
## 3 F_MEAN   Australia Test   35.7  42.3  35.7  8.89  8.89   NaN   NaN  0.216 
## 4 F_SNAIVE Australia Test   52.3  56.4  52.3 13.3  13.3    NaN   NaN -0.0138

Do the residuals from the best method resemble white noise?

No, but the good thing is that the errors are uncorrelated.

10 Exercise 10

  • Create a training set for Australian takeaway food turnover (aus_retail) by withholding the last four years as a test set.
  • Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
  • Compute the accuracy of your forecasts. Which method does best?
  • Do the residuals from the best method resemble white noise?
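The code behind the table below is not included in this excerpt; a sketch following the pattern of the previous exercises (the aggregation to a single national series and the object names are my assumptions):

```r
takeaway <- aus_retail %>%
    filter(Industry == "Takeaway food services") %>%
    summarise(Turnover = sum(Turnover))                   # aggregate across states

train <- takeaway %>% slice(1:(nrow(takeaway) - 48))      # hold out last 4 years
test  <- takeaway %>% slice(-(1:(nrow(takeaway) - 48)))

fit <- train %>%
    model(
        F_MEAN   = MEAN(Turnover),
        F_NAIVE  = NAIVE(Turnover),
        F_SNAIVE = SNAIVE(Turnover),
        F_DRIFT  = RW(Turnover ~ drift())
    )

fc <- fit %>% forecast(h = 48)
accuracy(fc, test) %>% arrange(RMSE)
```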
## # A tibble: 4 × 10
##   .model   .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 F_DRIFT  Test   19.3  66.5  45.0  1.02  2.63   NaN   NaN -0.200
## 2 F_SNAIVE Test   34.2  70.7  39.8  1.94  2.29   NaN   NaN -0.116
## 3 F_NAIVE  Test   27.4  71.9  46.6  1.52  2.71   NaN   NaN -0.173
## 4 F_MEAN   Test  910.  913.  910.  55.3  55.3    NaN   NaN -0.173

No, the residuals do not resemble white noise.

11 Exercise 11

11.1 TODO!!!

We will use the Bricks data from aus_production (Australian quarterly clay brick production, 1956–2005) for this exercise.

  1. Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
  2. Compute and plot the seasonally adjusted data.
  3. Use a naïve method to produce forecasts of the seasonally adjusted data.
  4. Use decomposition_model() to reseasonalise the results, giving forecasts for the original data.
  5. Do the residuals look uncorrelated?
  6. Repeat with a robust STL decomposition. Does it make much difference?
  7. Compare forecasts from decomposition_model() with those from SNAIVE(), using a test set comprising the last 2 years of data. Which is better?

12 Exercise 12

12.1 TODO!!!